Bank Marketing Dataset

Bank Marketing Dataset

In the previous article, we went through a PyTorch MLP model. In this article, we try to improve the results by creating a Keras MLP model using Tensorflow.

Train and Test Sets

Similarly here, we use StratifiedKFold to keep the same percentage of samples of each target class as the complete set.

Our model here utilizes the accuracy and recall scores.

Confusion Matrix

The confusion matrix allows for visualization of the performance of an algorithm.

Some of the metrics that we use here to mesure the accuracy: \begin{align} \text{Confusion Matrix} = \begin{bmatrix}T_p & F_p\\ F_n & T_n\end{bmatrix}. \end{align}

where $T_p$, $T_n$, $F_p$, and $F_n$ represent true positive, true negative, false positive, and false negative, respectively.

\begin{align} \text{Precision} &= \frac{T_{p}}{T_{p} + F_{p}},\\ \text{Recall} &= \frac{T_{p}}{T_{p} + F_{n}},\\ \text{F1} &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}\\ \text{Balanced-Accuracy (bACC)} &= \frac{1}{2}\left( \frac{T_{p}}{T_{p} + F_{n}} + \frac{T_{n}}{T_{n} + F_{p}}\right ) \end{align}

The accuracy can be a misleading metric for imbalanced data sets. In these cases, a balanced accuracy (bACC) [6] is recommended that normalizes true positive and true negative predictions by the number of positive and negative samples, respectively, and divides their sum by two.

Conclutions

Comparing the results, we can see that the bACC here is slightly here from our previous model. Moreover, other accuracy parameters showed better results.


Refrences

  1. S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

  2. S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. In P. Novais et al. (Eds.), Proceedings of the European Simulation and Modelling Conference - ESM'2011, pp. 117-121, Guimaraes, Portugal, October, 2011. EUROSIS. [bank.zip]

  3. Scikit-learn Precision-Recall

  4. Mower, Jeffrey P. "PREP-Mt: predictive RNA editor for plant mitochondrial genes." BMC bioinformatics 6.1 (2005): 1-15.

  5. Precision and recall wikipedia page